
dns: Add is_hostname() for RFC 1123 §2.1 Internet hostname validation#2346

Open
vtushar06 wants to merge 2 commits intosourcemeta:mainfrom
vtushar06:add-hostname-validation


@vtushar06

Description:

Adds sourcemeta::core::is_hostname() in a new src/core/dns module following the same pattern as src/core/ip and src/core/time.

This is a building block for format-assertion support in Blaze for Draft 4 and Draft 6.

Function signature:
auto is_hostname(std::string_view value) -> bool;

Pure string_view state machine. No heap allocations. No regex. No external deps.

RFC 1123 §2.1 compliance:

  • First char: letter or digit (§2.1 relaxation)
  • Label length: 1-63 chars (§2.1 MUST)
  • Total length: 1-255 chars (§2.1 SHOULD)
  • Rejects trailing dot, leading dot, double dot
  • Rejects labels starting or ending with hyphen
  • Rejects underscore and non-ASCII bytes
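
As an illustration of how the rules above fit into a single pass, here is a minimal sketch. It is not the code from this PR: the name `is_hostname_sketch` and the helper `check_examples` are hypothetical, and the real implementation in src/core/dns/hostname.cc may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch of the rules listed above; NOT the PR's is_hostname().
constexpr bool is_hostname_sketch(std::string_view value) noexcept {
  // Total length: 1-255 characters
  if (value.empty() || value.size() > 255) {
    return false;
  }

  std::size_t label_length = 0; // characters seen so far in the current label
  char previous = '.';          // start of input behaves like a label boundary

  for (const char current : value) {
    const bool is_letter = (current >= 'a' && current <= 'z') ||
                           (current >= 'A' && current <= 'Z');
    const bool is_digit = current >= '0' && current <= '9';

    if (current == '.') {
      // A dot must close a non-empty label that does not end in a hyphen.
      // This rejects leading dots, consecutive dots, and labels like "a-"
      if (label_length == 0 || previous == '-') {
        return false;
      }
      label_length = 0;
    } else if (is_letter || is_digit) {
      // Label length: at most 63 characters
      if (++label_length > 63) {
        return false;
      }
    } else if (current == '-') {
      // A hyphen may not start a label
      if (label_length == 0 || ++label_length > 63) {
        return false;
      }
    } else {
      // Underscore, NUL, and any non-ASCII byte end up here
      return false;
    }

    previous = current;
  }

  // The last label must be non-empty (no trailing dot) and not end in a hyphen
  return label_length > 0 && previous != '-';
}

// Hypothetical helper exercising the cases named in the description
inline void check_examples() {
  assert(is_hostname_sketch("example.com"));
  assert(is_hostname_sketch("1host.example")); // digit-first label is allowed
  assert(is_hostname_sketch("XN--aa---o47jg78q"));
  assert(is_hostname_sketch(std::string(63, 'a')));
  assert(!is_hostname_sketch(std::string(64, 'a')));
  assert(!is_hostname_sketch(""));
  assert(!is_hostname_sketch("example."));
  assert(!is_hostname_sketch(".example"));
  assert(!is_hostname_sketch("a..b"));
  assert(!is_hostname_sketch("-a.b"));
  assert(!is_hostname_sketch("a-.b"));
  assert(!is_hostname_sketch("a_b"));
}
```

The sketch tracks only the current label length and the previous character, so it needs no allocation and no backtracking. The trailing-dot case falls out of the final check, because a dot resets the label length to zero.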

One deliberate test suite divergence:
Test #20 in Group 1 (draft7+) marks XN--aa---o47jg78q as invalid, citing RFC 5891 §4.2.3.1 (IDNA2008). RFC 1123 has no such rule, and the Draft 4 test suite marks xn--4gbwdl.xn--wgbh1c (same structural pattern) as valid. Our implementation accepts XN--aa---o47jg78q (spec-faithful per RFC 1123 §2.1).

One bug fixed vs ajv-formats and python-jsonschema:
Both accept example. (trailing dot) via regex \\.?$ matching. Our state machine rejects it by construction.
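
To make the trailing-dot point concrete, the sketch below shows how a pattern that ends in an optional dot accepts `example.`. The pattern is a simplified, hypothetical one written for this illustration. It is not copied from ajv-formats or python-jsonschema, and the function name `regex_hostname_with_optional_dot` is likewise an assumption.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical simplified pattern with an optional trailing dot before the
// end anchor. It only illustrates the failure mode described above.
inline bool regex_hostname_with_optional_dot(const std::string &value) {
  static const std::regex pattern{
      R"(^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)*\.?$)"};
  return std::regex_match(value, pattern);
}

// Hypothetical helper showing which inputs the pattern lets through
inline void check_regex_examples() {
  assert(regex_hostname_with_optional_dot("example"));
  // The optional dot before the end anchor consumes the trailing dot
  assert(regex_hostname_with_optional_dot("example."));
  // Only one trailing dot is tolerated
  assert(!regex_hostname_with_optional_dot("example.."));
}
```

A single-pass validator that requires the final label to be non-empty has no equivalent of the optional dot, which is why the trailing-dot input is rejected without a dedicated check.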

Expected test suite results:

draft4: 27/27 pass (no A-label group)
draft6: 27/27 pass (no A-label group)
draft7/2019-09/2020-12/v1: 23/61 pass (Group 2 is IDNA2008 - out of scope for hostname format per JSON Schema spec)

Tests: 44 cases (21 valid, 23 invalid)

Out of scope for this PR:

  • Blaze integration (separate follow-up PR)
  • idn-hostname format (needs IDNA2008 library)

Copilot AI review requested due to automatic review settings April 11, 2026 08:08

Copilot AI left a comment


Pull request overview

Adds a new DNS core module that provides RFC 1123 §2.1 (JSON Schema hostname) Internet hostname validation, plus a dedicated unit test suite and CMake wiring consistent with existing core/ip and core/time modules.

Changes:

  • Introduce sourcemeta::core::is_hostname(std::string_view) -> bool implemented as a no-allocation ASCII state machine.
  • Add comprehensive hostname format unit tests (valid/invalid cases including length limits and ASCII-only enforcement).
  • Wire the new core/dns library and its tests into the top-level CMake options/build.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/core/dns/include/sourcemeta/core/dns.h | Public API for is_hostname() with Doxygen docs. |
| src/core/dns/hostname.cc | Implements RFC 1123 §2.1 hostname validation via a state machine. |
| src/core/dns/CMakeLists.txt | Defines the new sourcemeta::core::dns library target. |
| test/dns/hostname_test.cc | Adds unit tests covering valid/invalid hostname inputs and edge cases. |
| test/dns/CMakeLists.txt | Adds a dns unit test target linked to sourcemeta::core::dns. |
| CMakeLists.txt | Adds SOURCEMETA_CORE_DNS option and includes module/tests subdirectories. |



@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 6 files

@vtushar06 vtushar06 force-pushed the add-hostname-validation branch 2 times, most recently from 2f8cab5 to f5747ca Compare April 11, 2026 08:54
Adds sourcemeta::core::is_hostname() in a new src/core/dns module
following the same pattern as src/core/ip and src/core/time.

- Pure string_view state machine, no heap allocations
- Validates hostname per RFC 1123 §2.1 + RFC 952 grammar
- First char: letter or digit (RFC 1123 §2.1 relaxation of RFC 952)
- Label length: 1-63 chars (RFC 1123 §2.1 MUST)
- Total length: 1-255 chars (RFC 1123 §2.1 SHOULD)
- Rejects trailing dot, leading dot, consecutive dots
- Rejects labels starting or ending with hyphen
- Rejects underscore and non-ASCII bytes
- Accepts XN--aa---o47jg78q (RFC 1123 has no positions-3-4 rule;
  test suite cites RFC 5891 which is IDNA2008, not RFC 1123)
- 44 unit tests (21 valid, 23 invalid)

draft4 and draft6: expected 27/27 pass (no A-label group)
draft7+: expected 23/61 pass (Group 2 is IDNA2008, out of scope)

Relates to format-assertion support for Draft 4 and Draft 6.

Signed-off-by: Tushar Verma <tusharmyself06@gmail.com>
- Apply clang-format (LLVM style) to four code blocks that violated
  line-length / alignment rules under --dry-run -Werror:
    valid_label_exactly_63: collapse two-line EXPECT_TRUE to one line
    invalid_label_64: collapse two-line EXPECT_FALSE to one line
    invalid_fullwidth_dot: split adjacent string literal across two lines
    invalid_high_bit_byte / invalid_nul_byte: align string_view
      initialiser list per LLVM column rules
- Update comment on invalid_empty: replace "<name> requires at least
  one <let>" with "<hname> requires at least one <name> / label"
  (Copilot review: original phrasing was inaccurate given the RFC 1123
  §2.1 relaxation that allows digit-first labels)

All 44 tests still pass.

Signed-off-by: Tushar Verma <tusharmyself06@gmail.com>
@vtushar06 vtushar06 force-pushed the add-hostname-validation branch from f5747ca to d217b3b Compare April 11, 2026 08:56
@vtushar06

Hey @jviotti, the PR is ready for review. Let me know once you are done.
